Improvement of missing genotype imputation through bi-directional parsing of large SNP panels
Author: Christine Sinoquet
Abstract
Difficult analyses such as disease association studies, which aim at mapping the genetic variants underlying complex human diseases, rely on high-throughput genotyping techniques. However, a shortcoming of these techniques is that they generate missing calls. Computational inference of the missing data is a challenging alternative to re-genotyping the missing regions. In this paper, we present SNPShuttle, an algorithm designed to gain accuracy over an earlier method, NPUTE, described by Roberts and co-authors [7]. Given an SNP panel, the NPUTE algorithm infers missing data in a single parse, relying on local similarity within sliding windows. SNPShuttle instead scans the SNP panel iteratively and bi-directionally, to resolve missing data with more confidence.
hal-00300596, version 3 - 12 Feb 2009
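To make the idea of window-based inference with alternating scan directions concrete, here is a minimal Python sketch. It is not the NPUTE or SNPShuttle procedure, whose details are not given in the abstract. All names (`impute_bidirectional`, `half_window`, `min_shared`), the 0/1 allele coding, and the rule "fill only when the nearest donor shares enough observed calls in the window" are assumptions made for illustration.

```python
from copy import deepcopy
from typing import List, Optional, Sequence

# panel[snp][sample]; None marks a missing call (hypothetical encoding).
Panel = List[List[Optional[int]]]


def _compare(panel: Panel, a: int, b: int, lo: int, hi: int, skip: int):
    """Return (shared, mismatches) for samples a and b over SNPs lo..hi-1,
    ignoring the SNP being imputed and any SNP where either call is missing."""
    shared = mismatches = 0
    for snp in range(lo, hi):
        if snp == skip:
            continue
        x, y = panel[snp][a], panel[snp][b]
        if x is None or y is None:
            continue
        shared += 1
        mismatches += x != y
    return shared, mismatches


def _one_pass(panel: Panel, order: Sequence[int], half_window: int,
              min_shared: int) -> int:
    """One parse over the SNPs in the given order. Fills calls in place, so
    calls imputed earlier in the pass serve as evidence later on."""
    n_snps = len(panel)
    filled = 0
    for snp in order:
        lo = max(0, snp - half_window)
        hi = min(n_snps, snp + half_window + 1)
        row = panel[snp]
        for s in range(len(row)):
            if row[s] is not None:
                continue
            best = None  # (mismatches, donor sample)
            for d in range(len(row)):
                if d == s or row[d] is None:
                    continue
                shared, mism = _compare(panel, s, d, lo, hi, snp)
                # Assumed confidence rule: require enough jointly observed SNPs.
                if shared >= min_shared and (best is None or mism < best[0]):
                    best = (mism, d)
            if best is not None:
                row[s] = row[best[1]]
                filled += 1
    return filled


def impute_bidirectional(panel: Panel, half_window: int = 2,
                         min_shared: int = 2, max_rounds: int = 10) -> Panel:
    """Alternate forward and backward parses until a round fills nothing."""
    result = deepcopy(panel)
    n = len(result)
    for _ in range(max_rounds):
        filled = _one_pass(result, range(n), half_window, min_shared)
        filled += _one_pass(result, range(n - 1, -1, -1), half_window, min_shared)
        if filled == 0:
            break
    return result
```

In this sketch a call that cannot be resolved confidently in one direction is left missing and may be resolved on a later pass, once neighbouring SNPs have been filled in.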
Similar resources
Iterative Two-Pass Algorithm for Missing Data Imputation in SNP Arrays
Although the quality of high-throughput genotyping techniques keeps improving, missing data remain fairly common. Studies have shown that even a low percentage of missing SNPs is detrimental to the reliability of downstream analyses such as SNP-disease association tests. This paper investigates the potential for improving the accuracy of an SNP inference method based on the algorithm forme...
Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows
MOTIVATION: Typical high-throughput genotyping techniques produce numerous missing calls that confound subsequent analyses, such as disease association studies. Common remedies for this problem include removing affected markers and/or samples or, otherwise, imputing the missing data. On small marker sets imputation is frequently based on a vote of the K-nearest-neighbor (KNN) haplotypes, but thi...
Performance analysis of methods to infer missing genotypes. Christine Sinoquet. No. hal-00326741, October 2008
Complex analyses such as genetic mapping, disease association studies, disease mapping in the context of environmental health and environmental epidemiology studies rely on high-throughput genotyping techniques. These analyses thoroughly examine genetic variations between subjects, in particular through Single Nucleotide Polymorphism (SNP). Nonetheless, though nowadays genotyping techniques imp...
Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment
Genotyping by sequencing (GBS) has recently emerged as a promising genomic approach for assessing genetic diversity on a genome-wide scale. However, concerns are not lacking about the uniquely large unbalance in GBS genotype data. While some genotype imputation has been proposed to infer missing observations, little is known about the reliability of a genetic diversity analysis o...
Estimation of genotype imputation accuracy using reference populations with varying degrees of relationship and marker density panel
Genotype imputation from low-density to high-density (SNP) chips is an important step before applying genomic selection, because denser chips can provide more reliable genomic predictions. In the current research, the accuracy of genotype imputation from low and moderate-density panels (5K and 50K) to high-density panels in the purebred and crossbred populations was assessed. The simulated popu...
Publication date: 2009